Fragments of DNA which encode peptides capable of inducing in vivo the synthesis of anti-hepatitis A virus antibodies

Recombinant DNA fragments are provided, and their
nucleotide sequences determined, which encode the
antigenic determinants responsible for the immunogenicity
and the immunological specificity of various proteins of
Hepatitis A virus (hereinafter referred to as HAV),
including the VP-1 protein, the main structural protein
of HAV.
TITLE OF THE INVENTION

FRAGMENTS OF DNA WHICH ENCODE PEPTIDES 
CAPABLE OF INDUCING IN VIVO THE SYNTHESIS OF ANTI- 
HEPATITIS A VIRUS ANTIBODIES. 



BACKGROUND OF THE INVENTION 

Hepatitis A is a liver disease which, 
although not commonly fatal, can induce long periods 
of debilitating illness. The disease is commonly 
spread by direct contact with an infected individual 
or by hepatitis A virus (HAV) contaminated drinking 
water and/or food. 

The prior art does not identify the protein
or proteins of HAV (hepatitis A virus) which induce
neutralizing antibodies to this virus. One of the
major drawbacks to examining the protective
antigenicity of HAV proteins has been the lack of
sufficient quantities of HAV and its polypeptide 
components. The virus is made in very small 
quantities in cell culture, has a limited animal host 
range, and is difficult to purify from infected cell 
cultures and animal tissues. Recently a patent 
application claiming the VP-1 structural protein of 
HAV prepared by isolation from the virus was filed by 
the assignee of this application and is presently 
copending as U.S. S.N. 541,836, filed October 14, 
1983. VP-1 is recognized in our laboratories as 
being the main structural protein of HAV, and 
accordingly the most important for vaccine use. The
appropriate dosage forms and regimens are described in
that copending application, which is incorporated by
reference.

Cloning of the genomic material of HAV and
potential analysis of its sequence has recently been
reported: Von der Helm et al., J. Virological
Methods 3, 37-43 (1981); see also EPO 0061740,
priority March 28, 1981, published October 6, 1982,
disclosing the same work; and Ticehurst et al., Proc.
National Academy Science USA, Vol. 80, pgs 5885-5889,
October 1983.

The Von der Helm work describes preparation
of cDNA and cloning of that cDNA into plasmids, and
subsequent expression of HAV antigen, but no
nucleotide sequence of the DNA or amino acid sequence
of the corresponding proteins is described, nor is
the identity of any of the viral proteins expressed
known. In fact, the weak antigenic responses of the
viral proteins obtained by Von der Helm et al. tend to
indicate that most portions of the HAV genome
encoding the important antigenic proteins of HAV have
not been successfully identified or used in the
cloning experiments.

Ticehurst does report a partial nucleotide
sequence from the 3' terminus of HAV, but does not
appear to have identified that part of the sequence
which encodes the antigenic proteins. From our
experiments, we believe that Ticehurst has not
sequenced the antigenic portion, and that the Ticehurst
sequence work is far outside the antigenic region.

OBJECTS OF THE INVENTION 

It is an object of the present invention to
provide a method for cloning the genome of HAV
encoding for the major antigenic proteins of HAV,
including VP-1, VP-2, VP-3 and VP-4, especially
VP-1. Another object is to determine the nucleotide
sequences of the HAV genome, including those
sequences which encode the antigenic proteins,
including VP-1. Yet another object is to provide a
method for producing VP-1, VP-2, VP-3, VP-4, or parts
thereof, by expression of cloned DNA in an
appropriate host. Still another object is to provide
a vector containing the VP-1, VP-2, VP-3 or VP-4
genes, or parts thereof, but especially a vector
containing the VP-1, or part of VP-1, gene. Another
object is to provide transformed hosts containing a
vector containing the VP-1, VP-2, VP-3 or VP-4 genes,
or parts thereof, and being capable of expressing the
peptide encoded by said genes or parts thereof.
These and other objects of the present invention will
be apparent from the following description.
SUMMARY OF THE INVENTION

Fragments of DNA which code for immunogenic
peptides VP-1, VP-2, VP-3 and VP-4 capable of
inducing in vivo the formation of anti-Hepatitis A
virus antibodies are provided and their nucleotide
sequence determined. In other terms, the invention
concerns fragments of DNA which code for the
antigenic determinants on peptides normally encoded
by the RNA of HAV or by corresponding double-stranded
cDNA; these antigenic determinants being essential
for the antigenic properties of the products of
natural viral RNA expression.

DETAILED DESCRIPTION 

RNA was extracted from purified hepatitis A
virus particles, annealed to a dT-tailed plasmid, and
the complementary cDNA synthesized. This single
stranded DNA was used as a template for synthesis of
its complementary strand using DNA polymerase. The
double stranded cDNA thus formed corresponded to at
least a portion of the HAV antigenic protein genes.
It was modified to provide "sticky ends" and placed
into an appropriate vector; suitable hosts include
prokaryotic and eukaryotic organisms. These hosts
were exposed to the resulting vector and those which
stably incorporated the vector were identified and
isolated.
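
The strand relationships in this step can be illustrated with a short sketch. It is not part of the described procedure: the function names are hypothetical, and the RNA fragment is simply the opening bases of the sequence numbered 4 in Example 9 written in RNA letters, chosen for illustration. The first cDNA strand is the reverse complement of the RNA template, and the second strand is the reverse complement of the first, so it carries the same sense as the viral RNA.

```python
# Illustrative sketch only: strand relationships during cDNA synthesis.
# Function names and the choice of fragment are assumptions for illustration.

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA string written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(dna))


def first_strand_cdna(rna: str) -> str:
    """First-strand cDNA: reverse complement of the RNA template."""
    return reverse_complement(rna.upper().replace("U", "T"))


def second_strand(first_strand: str) -> str:
    """Second strand: reverse complement of the first cDNA strand."""
    return reverse_complement(first_strand)


rna_fragment = "AUGGAUGUUUCAGGA"
first = first_strand_cdna(rna_fragment)
second = second_strand(first)
print(first)   # antisense DNA strand
print(second)  # sense DNA strand, same letters as the RNA with T for U
```

The second strand reads the same as the RNA with T in place of U, which is why the double-stranded cDNA can stand in for the viral RNA in cloning and sequencing.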

Vector DNA was extracted from these hosts
and this material was characterized. At least 5
fragments of cloned HAV DNA were sequenced to
determine the nucleotide sequence of the region
encoding the antigenic peptides, VP-1, VP-2, VP-3 and
VP-4. The corresponding amino acid sequence encoded
thereby was inferred.

Knowledge of the nucleotide sequences
encoding the amino acid sequences of these peptides
makes possible their synthesis, as well as that of
subunits, homologues and analogues thereof, thus allowing
determination of the relationship between structure
and function. The peptides, which as noted above do
not form part of this invention, can be used in
pharmaceutical compositions to make vaccines.

The HAV peptides of the present invention
may be prepared from their constituent amino acids by
standard methods of protein synthesis, e.g.,
Schroeder et al., "The Peptides", Vol. I, Academic
Press, 1965, or Bodanszky et al., "Peptide
Synthesis", Interscience Publishers 1966, or McOmie
(ed.), "Protective Groups in Organic Chemistry",
Plenum Press 1973, the disclosures of which are
hereby incorporated by reference.

The peptides of HAV also may be prepared by
recombinant DNA techniques by, for example, the
isolation or preparation of appropriate DNA sequences
and incorporation of these sequences into vectors in
a suitable host and expression of the desired peptide
therefrom. The use of recombinant DNA techniques is
described in many published articles, for example,
Maniatis et al., Molecular Cloning, A Laboratory
Manual, Cold Spring Harbor, New York 1982, the
disclosure of which is hereby incorporated by
reference. Modification of the nucleotides coding
for the peptides of the present invention according
to known techniques permits the preparation via
recombinant DNA techniques of peptides having amino
acid sequences altered from those of the peptides of
the present invention. Certain of these amino acids
may be coded for by more than one triplet nucleotide
sequence. It is to be understood that the disclosed
nucleotide sequence is to include other codons coding
for the same amino acid, e.g., the codons CGA and AGA
each code for arginine.
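
The codon degeneracy mentioned above can be made concrete with a small sketch. It builds the standard genetic code table and lists every codon that encodes the same amino acid as a given codon. The function name `synonymous_codons` is a hypothetical helper, and the use of the standard code (rather than any organism-specific variant) is an assumption.

```python
# Minimal sketch of codon degeneracy using the standard genetic code.
# synonymous_codons is a hypothetical helper name, not from the source.
from itertools import product

BASES = "TCAG"
# One-letter amino acids in TCAG x TCAG x TCAG order; '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def synonymous_codons(codon: str) -> list[str]:
    """Return all codons that encode the same amino acid as `codon`."""
    amino_acid = CODON_TABLE[codon.upper()]
    return [c for c, aa in CODON_TABLE.items() if aa == amino_acid]


arginine_codons = synonymous_codons("CGA")
print(arginine_codons)
```

Both CGA and AGA appear in the resulting list, along with the four other arginine codons, so any of them could be substituted without changing the encoded peptide.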

Suitable hosts for expression of the HAV
peptides include prokaryotic organisms such as E. coli
and B. subtilis, and eukaryotic organisms such as
Saccharomyces cerevisiae and Chinese hamster ovary
cells. It is also to be understood that these
proteins can be expressed directly in a mammalian
species by means of appropriate expression vectors
such as vaccinia, varicella zoster, adeno or herpes
simplex viruses.

The following examples illustrate the
present invention.

EXAMPLE 1 

VIRUS RNA EXTRACTION

1.1 Hepatitis A virus particles (strain
CR326), purified from virus-infected LLC-MK2 cells (a
monkey kidney cell line) by CsCl gradient
centrifugation, were disrupted at 65°C with 1% SDS and
20 mM EDTA, followed by phenol extraction using phenol
saturated with 0.1 M Tris, pH 7.4.

1.2 The aqueous layer of the extraction was
further extracted twice with equal volumes of
CHCl3:isoamyl alcohol (24:1) and precipitated with
0.2 M sodium acetate pH 5.5 and 2 volumes of EtOH at
-20°C.

1.3 The EtOH precipitate was collected by
centrifugation and dissolved in H2O.
EXAMPLE 2

PREPARATION OF cDNA CLONES FROM THE 3' END OF THE 
VIRAL GENOME 

The initial 3' cDNA clones were prepared
using the procedure of H. Okayama and P. Berg
(Molecular and Cellular Biology 2: 161-170 (1982)).
In brief, the viral RNA was annealed to dT-tailed
plasmid (derived from pSV 0.71-0.86) and cDNA
synthesized by reverse transcriptase using the
dT-tailed linker (derived from pSV 0.19-0.32). Using
the linker as a primer, the second cDNA strand was
synthesized by DNA polymerase after removal of the
RNA by RNase H. The cDNA was ligated and transformed
into E. coli. The resulting clones were selected by
resistance to ampicillin and screened by hybridization
to a 32P-labeled HAV cDNA prepared from the viral
RNA.



The largest viral insert obtained was
approximately 2.3 kb in size and was called clone a
18. Restriction enzyme analysis of this clone
demonstrated the presence of two Pvu II sites near
the 5' end of the cloned insert. The cloned DNA was
restricted with Pvu II and the 280 bp 5' proximal
fragment was purified by agarose gel electrophoresis,
visualized by ethidium bromide staining and
electroeluted in dialysis tubing for use as a primer
for further cDNA cloning.

EXAMPLE 3 

PREPARATION OF cDNA CLONES BY PRIMER EXTENSION

3.1 The 280 bp Pvu II restriction fragment
of cDNA clone a 18 was denatured by boiling for 10
min. and quick-cooling in ice-water, and then used as
a primer for HAV cDNA synthesis. The denatured
primer was either annealed or hybridized to the viral
RNA. For annealing, 160 ng of denatured primer was
added to approximately 1 µg of purified HAV RNA in 20
µl of H2O and heated to 70°C for 10 min., then 37°C
for 15 min. For the hybridization, 140 ng of
denatured primer was added to 1 µg of purified HAV
RNA in 80% formamide, 0.4 M NaCl, 0.01 M PIPES pH
6.4, and 2 mM EDTA and incubated at 47°C for 3.5
hours. The formamide was removed by performing 4
sequential ethanol precipitations.

3.2 After annealing or hybridization of the
primer to the RNA, the first strand of cDNA was
synthesized using 50 mM Tris pH 8, 0.34 mM dCTP, 1 mM
dGTP, dATP, dTTP, 10 mM 2-mercaptoethanol, 10 mM
magnesium acetate, [32P-α]dCTP (8 µCi/10 µl),
RNasin (7 units/10 µl), and AMV reverse transcriptase
(16 units/10 µl) by incubating at 42°C for 30 min.
The reaction mix was extracted twice with phenol,
chloroform extracted, and precipitated with ethanol.

3.3 The second strand of cDNA was then
synthesized using 20 mM Tris pH 7.4, 5 mM MgCl2, 10
mM (NH4)2SO4, 100 mM KCl, 100 µg/ml BSA, 0.04
mM dATP, dGTP, dCTP and dTTP, 9 units/ml RNase H, and
1750 units/ml DNA polymerase I by incubating at 12°C
for 60 min. and then 22°C for 60 min. The reaction
mix was extracted twice with phenol and chloroform
and ethanol precipitated.

3.4 The double-stranded cDNA was dC-tailed
using terminal deoxynucleotidyl transferase, annealed
to pBR 322 DNA which was dG-tailed at the PstI site,
and transformed into competent E. coli strain RR-1.
The resulting clones were selected for tetracycline
resistant, ampicillin-sensitive growth and screened
for positive hybridization to a 32P-labeled cDNA
prepared from HAV RNA.

EXAMPLE 4

VERIFICATION OF HAV-SPECIFICITY OF cDNA CLONES

To ensure the cDNA clones were generated
from Hepatitis A genetic information, the clones were
labeled with [32P-α]dCTP by nick translation and
hybridized to uninfected and HAV-infected LLC-MK2
cellular RNA bound to nitrocellulose membrane
filters. The cDNA clones hybridized only to the
HAV-infected cell RNA.

EXAMPLE 5

DETERMINATION OF THE RELATIVE POSITION OF THE CLONED
cDNAs AND ANALYSIS OF THE BASE SEQUENCE

The cloned cDNAs were analyzed by restriction
enzyme digestion and by cross-hybridization of the
clones to one another in order to locate the relative
positions of five clones, T28-18, T28-123, T28-94,
T28-71 and T28-77, as shown in Fig. 1. From this
analysis one can deduce that over 90% of the viral
genome has been cloned in overlapping cDNA clones
starting at the 3' end of the genome. The
restriction enzyme map of the HAV genome is that line
marked "HAV" in Fig. 1. The location of the
structural proteins on the polio genome is also shown
in Fig. 1 as an indication of the probable predicted
position on the HAV genome, since polio and HAV are
related viruses.
EXAMPLE 6

ANALYSIS OF THE BASE SEQUENCE OF THE cDNA CLONES

DNA sequencing has been performed by two
methods.

Chemical DNA sequencing has been performed
by the Maxam and Gilbert technique on clones T28-18,
T28-123, T28-77, T28-94 and T28-71.

Subclones of fragments of cDNA clone T28-77
were prepared in M13 phage and sequenced by the
dideoxynucleotide termination procedure described by
Sanger.

In all, a region of 3000 contiguous bases
has been sequenced. This region contains the
sequences which encode the two structural proteins
VP-1 and VP-3 of HAV, and the other proteins VP-2
and VP-4. The proteins VP-1 and VP-3 are the major
antigenic proteins; VP-2 and VP-4 are thought to be
less valuable as antigens.

The amino acid sequence of a mixture of
cyanogen bromide cleavage fragments generated from
purified VP-3 protein (covered in copending USSN
541,836, supra) has been compared to the translation
of the DNA sequence to more precisely locate the
region which encodes the VP-3.

The amino acid sequence of a mixture of
cyanogen bromide cleavage fragments generated from
purified VP-1 protein (covered in copending USSN
541,836, supra) has been compared to the translation
of the DNA sequence to more precisely locate the
region which encodes the VP-1 protein.

Referring to Fig. 1, the restriction map for
the HAV genome indicates the location of the
following 4 restriction sites: the Hinc II cleavage
site at base 1643 of the DNA sequence; the Hind III
cleavage site at base 1744; the Bam HI cleavage site
at base 2049; and the Pvu II cleavage site at base
2722. These genome map positions are in the 5' to 3'
direction on the genome. The nucleotide sequence of
the region encoding the structural proteins, together
with the reading frame for the protein, is as follows:



27 54
GAT GTG TGG GAC GTC ACC TTG CAG TGT AAA CTT GGC TCT CAT GAA CCT CTT TGA
Asp Val Trp Asp Val Thr Leu Gln Cys Lys Leu Gly Ser His Glu Pro Leu .

81 108
TCT TCC ACA AGG GGT AGG CTA CGG GTG AAA CCT CTT AGG CTA ATA CTT CTA TGA
Ser Ser Thr Arg Gly Arg Leu Arg Val Lys Pro Leu Arg Leu Ile Leu Leu .

135 162
AGA GAT GCT TTG GAT AGG GCA ACA GCG GCG GAT ATT GGT GAG TTG TTA AGA CAA
Arg Asp Ala Leu Asp Arg Ala Thr Ala Ala Asp Ile Gly Glu Leu Leu Arg Gln

189 216
AAA CCA TTC AAC GCC GGA GGA CTG GCT CTC ATC CAG TGG ATG CAT TGA GTG GAT
Lys Pro Phe Asn Ala Gly Gly Leu Ala Leu Ile Gln Trp MET His .   Val Asp

243 270
TGA TTG TCA GGG CTG TCT CTA GGT TTA ATC TCA GAC CTC TCT GTG CTT AGG GCA
.   Leu Ser Gly Leu Ser Leu Gly Leu Ile Ser Asp Leu Ser Val Leu Arg Ala

297 324
AAC ACC ATT TGG CCT TAA ATG GGA TCC TGT GAG AGG GGG TCC CTC CAT TGA CAG
Asn Thr Ile Trp Pro .   MET Gly Ser Cys Glu Arg Gly Ser Leu His .   Gln

351 378
CTG GAC TGT TCT TTG GGG CCT TAT GTG GTG TTT GCC TCT GAG GTA CTC AGG GGC
Leu Asp Cys Ser Leu Gly Pro Tyr Val Val Phe Ala Ser Glu Val Leu Arg Gly

405 432
ATT TAG GTT TTT CCT CAT TCT TAA ACA ATA ATG AAT ATG TCC AAA CAA GGA ATT
Ile .   Val Phe Pro His Ser .   Thr Ile MET Asn MET Ser Lys Gln Gly Ile

459 486
TTC CAG ACT GTC GGG AGT GGC CTT GAC CAC ATC CTG TCT TTG GCA GAT ATT GAG
Phe Gln Thr Val Gly Ser Gly Leu Asp His Ile Leu Ser Leu Ala Asp Ile Glu

513 540
GAA GAG CAA ATG ATT CAG TCC GTT GTT AGG ACT GCA GTG ACT GGT GCT TCT TAT
Glu Glu Gln MET Ile Gln Ser Val Val Arg Thr Ala Val Thr Gly Ala Ser Tyr

567 594
TTT ACT TCT GTG GAC CAA TCT TCA GTT CAT ACT GCT GAG GTT GGC TTA CAT CAA
Phe Thr Ser Val Asp Gln Ser Ser Val His Thr Ala Glu Val Gly Leu His Gln

621 648
ATT GAA CCC TTG AAA ACC TCT GTT GAT AAA CCT AGT TCT AAG AAG ACT CAG GGG
Ile Glu Pro Leu Lys Thr Ser Val Asp Lys Pro Ser Ser Lys Lys Thr Gln Gly

675 702
GAG AAG TTT TTC CTG ATT CAT TCT GCT GAT TGG CTC ACT ACA CAT GCT CTA TTT
Glu Lys Phe Phe Leu Ile His Ser Ala Asp Trp Leu Thr Thr His Ala Leu Phe

729 756
CAT GAA GTT GCA AAA TTG GAC GTG GTG AAA TTA TTG TAT AAT GAG CAG TTT GCC
His Glu Val Ala Lys Leu Asp Val Val Lys Leu Leu Tyr Asn Glu Gln Phe Ala

783 810
GTC CAA GGT TTG TTG AGA TAC CAC ACA TAT GCA AGA TTT GGC ATT GAG ATT CAA
Val Gln Gly Leu Leu Arg Tyr His Thr Tyr Ala Arg Phe Gly Ile Glu Ile Gln

837 864
GTT CAG ATA AAT CCC ACA CCC TTT CAG CAA GGG GGG CTA ATT TGT GCT ATG GTT
Val Gln Ile Asn Pro Thr Pro Phe Gln Gln Gly Gly Leu Ile Cys Ala MET Val

891 918
CCT AGT GAC CAA AGT TAT GGT TCG ATA GCA TCC TTG ACT GTT TAT CCT CAT GGT
Pro Ser Asp Gln Ser Tyr Gly Ser Ile Ala Ser Leu Thr Val Tyr Pro His Gly

945 972
TTG TTA AAT TGC AAC ATT AAC AAT GTG GTT AGA ATA AAG GTT CCA TTT ATT TAT
Leu Leu Asn Cys Asn Ile Asn Asn Val Val Arg Ile Lys Val Pro Phe Ile Tyr
999 1026
ACT AGA GGT GCT TAT CAC TTT AAG GAT CCA CAG TAT CCA GTT TGG GAA TTA ACA
Thr Arg Gly Ala Tyr His Phe Lys Asp Pro Gln Tyr Pro Val Trp Glu Leu Thr

1053 1080
ATC AGA GTT TGG TCA GAG TTG AAT ATT GGA ACA GGA ACT TCA GCT TAC ACT TCA
Ile Arg Val Trp Ser Glu Leu Asn Ile Gly Thr Gly Thr Ser Ala Tyr Thr Ser

1107 1134
CTT AAT GTT TTA GCT AGG TTT ACA GAT TTG GAG TTA CAT GGA TTA ACT CCT CTT
Leu Asn Val Leu Ala Arg Phe Thr Asp Leu Glu Leu His Gly Leu Thr Pro Leu

1161 1188
TCT ACA CAG ATG ATG AGA AAT GAA TTT AGA GTT AGT ACT ACT GAA AAT GTT GTA
Ser Thr Gln MET MET Arg Asn Glu Phe Arg Val Ser Thr Thr Glu Asn Val Val

1215 1242
AAT TTG TCG AAT TAT GAA GAT GCA AGG GCA AAA ATG TCT TTT GCT TTG GAT CAG
Asn Leu Ser Asn Tyr Glu Asp Ala Arg Ala Lys MET Ser Phe Ala Leu Asp Gln

1269 1296
GAA GAT TGG AAG TCT GAT CCT TCC CAA GGT GGT GGA ATT AAA ATT ACT CAT TTT
Glu Asp Trp Lys Ser Asp Pro Ser Gln Gly Gly Gly Ile Lys Ile Thr His Phe

1323 1350
ACT ACC TGG ACA TCC ATT CCA ACC TTA GCT GCT CAG TTT CCA TTC AAT GCT TCA
Thr Thr Trp Thr Ser Ile Pro Thr Leu Ala Ala Gln Phe Pro Phe Asn Ala Ser

1377 1404
GAT TCG GTT GGA CAA CAA ATT AAA GTT ATT CCA GTG GAC CCA TAT TTT TTC CAG
Asp Ser Val Gly Gln Gln Ile Lys Val Ile Pro Val Asp Pro Tyr Phe Phe Gln

1431 1458
ATG ACA AAC ACC AAT CCT GAT CAA AAG TGT ATA ACT GCC TTG GCT TCT ATT TGT
MET Thr Asn Thr Asn Pro Asp Gln Lys Cys Ile Thr Ala Leu Ala Ser Ile Cys

1485 1512
CAG ATG TTT TGC TTT TGG AGG GGA GAT CTT GTT TTT GAT TTT CAG GTT TTT CCA
Gln MET Phe Cys Phe Trp Arg Gly Asp Leu Val Phe Asp Phe Gln Val Phe Pro

1539 1566
ACC AAA TAT CAT TCA GGT AGG TTG TTG TTT TGC TTT GTT CCT GGG AAT GAG TTG
Thr Lys Tyr His Ser Gly Arg Leu Leu Phe Cys Phe Val Pro Gly Asn Glu Leu

1593 1620
ATA GAT GTT ACT GGA ATC ACA TTA AAA CAG GCA ACC ACT GCT CCT TGT GCA GTG
Ile Asp Val Thr Gly Ile Thr Leu Lys Gln Ala Thr Thr Ala Pro Cys Ala Val

1647 1674
ATG GAC ATT ACA GGA GTG CAG TCA ACC TTG AGA TTT CGT GTT CCT TGG ATT TCT
MET Asp Ile Thr Gly Val Gln Ser Thr Leu Arg Phe Arg Val Pro Trp Ile Ser

1701 1728
GAT ACA CCC TAT CGA GTG AAT AGG TAC ACG AAG TCA GCA CAT CAA AAA GGT GAG
Asp Thr Pro Tyr Arg Val Asn Arg Tyr Thr Lys Ser Ala His Gln Lys Gly Glu

1755 1782
TAT ACT GCC ATT GGG AAG CTT ATT GTG TAT TGT TAT AAT AGG CTG ACT TCT CCT
Tyr Thr Ala Ile Gly Lys Leu Ile Val Tyr Cys Tyr Asn Arg Leu Thr Ser Pro

1809 1836
TCT AAT GTT GCT TCT CAT GTT AGA GTT AAT GTT TAT CTT TCA GCA ATT AAT TTG
Ser Asn Val Ala Ser His Val Arg Val Asn Val Tyr Leu Ser Ala Ile Asn Leu

1863 1890
GAA TGT TTT GCT CCT CTT TAT CAT GCT ATG GAT GTT ACC ACA CAG GTT GGA GAT
Glu Cys Phe Ala Pro Leu Tyr His Ala MET Asp Val Thr Thr Gln Val Gly Asp

1917 1944
GAT TCA GGA GGT TTT TCA ACA ACA GTT TCG ACA GAG CAG AAT GTT CCT GAT CCC
Asp Ser Gly Gly Phe Ser Thr Thr Val Ser Thr Glu Gln Asn Val Pro Asp Pro
1971 1998
CAA GTT GGT ATA ACA ACT ATG AAG GAC CTG AAA GGG AAA GCC AAT AGG GGA AAG
Gln Val Gly Ile Thr Thr MET Lys Asp Leu Lys Gly Lys Ala Asn Arg Gly Lys

2025 2052
ATG GAT GTT TCA GGA GTG CAA GCA CCT GTG GGA GCT ATC ACA ACA ATT GAG GAT
MET Asp Val Ser Gly Val Gln Ala Pro Val Gly Ala Ile Thr Thr Ile Glu Asp

2079 2106
CCA GCA TTA GCA AAG AAA GTA CCT GAA ACG TTT CCT GAA TTG AAG CCT GGA GAG
Pro Ala Leu Ala Lys Lys Val Pro Glu Thr Phe Pro Glu Leu Lys Pro Gly Glu

2133 2160
TCT AGA CAT ACA TCA GAT CAC ATG TCT ATT TAT AAA TTC ATG GGA AGG TCT CAT
Ser Arg His Thr Ser Asp His MET Ser Ile Tyr Lys Phe MET Gly Arg Ser His

2187 2214
TTT TTG TGT ACT TTT ACC TTC AAT TCA AAT AAT AAA GAG TAC ACA TTT CCA ATA
Phe Leu Cys Thr Phe Thr Phe Asn Ser Asn Asn Lys Glu Tyr Thr Phe Pro Ile

2241 2268
ACC TTG TCT TCG ACT TCT AAT CCT CCT CAT GGT TTA CCA TCA ACA TTA AGG TGG
Thr Leu Ser Ser Thr Ser Asn Pro Pro His Gly Leu Pro Ser Thr Leu Arg Trp

2295 2322
TTC TTC AAT CTG TTT CAG TTG TAT AGA GGA CCA TTG GAT TTG ACA ATT ATC ATC
Phe Phe Asn Leu Phe Gln Leu Tyr Arg Gly Pro Leu Asp Leu Thr Ile Ile Ile

2349 2376
ACA GGA GCT ACT GAT GTG GAT GGA ATG GCC TGG TTT ACT CCA GTA GGC CTT GCT
Thr Gly Ala Thr Asp Val Asp Gly MET Ala Trp Phe Thr Pro Val Gly Leu Ala

2403 2430
GTT GAC ACC CCA TGG GTG GAA AAG GAA TCA GCT TTG TCT ATT GAT TAT AAA ACT
Val Asp Thr Pro Trp Val Glu Lys Glu Ser Ala Leu Ser Ile Asp Tyr Lys Thr
2673
GAG TTC TAT TTT CCT AGA GCT CCA TTA AAT TCA AAT GCT ATG TTG TCC ACT GAG
Glu Phe Tyr Phe Pro Arg Ala Pro Leu Asn Ser Asn Ala MET Leu Ser Thr Glu

2727 2754

2889
TTG TCA AAT GAA GTG CTT CCA CCT CCT AGG AAA ATG AAG GGG TTA TTT TCA CAA
Leu Ser Asn Glu Val Leu Pro Pro Pro Arg Lys MET Lys Gly Leu Phe Ser Gln

2943 2970
GCC AAA ATX TCT CTT TTT TAT ACT GAG GAA CAT GAA ATA ATG AAA TTT LCO TGG
Ala Lys Ile Ser Leu Phe Tyr Thr Glu Glu His Glu Ile MET Lys Phe     Trp

2997
AGA GGA GTG A
Arg Gly Val
Within this entire sequence is encoded the
information necessary to make the antigenic proteins
of HAV. The sequences encoding for the structural
proteins begin at base 403. The VP-3 protein is
encoded starting approximately at map position 1149.
The VP-1 protein is encoded starting approximately at
map position 1882. The key sub-unit sequences within
VP-1, designated Sequences I, II, III, IV, and V,
start, respectively, at 1882, 1963, 1999, 2146, 2347.
An effective vaccine can contain any one or more of
these subunits, or of any other effective peptide
subunit encoded within the genome sequence.
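
As a quick consistency check on these map positions, the sketch below tests whether each stated start falls on a codon boundary of the printed reading frame. It assumes, from the way the listing is laid out in triplets, that the frame begins at base 1; the helper names are hypothetical. The VP-3 start is left out because the text gives it only approximately.

```python
# Sketch: do the stated start positions fall on codon boundaries?
# Assumption: the printed reading frame starts at base 1 of the listing.

STRUCTURAL_START = 403
VP1_SUBUNIT_STARTS = {"I": 1882, "II": 1963, "III": 1999, "IV": 2146, "V": 2347}


def codon_offset(position: int, frame_origin: int = 1) -> int:
    """0 if `position` is the first base of a codon in the frame, else 1 or 2."""
    return (position - frame_origin) % 3


def codon_number(position: int, frame_origin: int = 1) -> int:
    """1-based index of the codon that contains `position`."""
    return (position - frame_origin) // 3 + 1


for name, start in VP1_SUBUNIT_STARTS.items():
    print(f"Sequence {name}: base {start}, codon {codon_number(start)}, "
          f"offset {codon_offset(start)}")
```

Under that assumption every listed subunit start, and base 403, has offset 0, so each begins exactly at the first base of a codon in the printed frame.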

The value of the nucleotide sequence will be 
readily apparent to those skilled in the art: a 
particular nucleotide sequence encoding for an active 
peptide or peptide fragment can be synthesized and 
cloned into an expression system in order to produce 
the particular peptide or subunit desired. These 
nucleotide sequences can be either made by total 
synthesis, that is, by linking bases according to 
known techniques, or by direct isolation from the 
viral RNA or cDNA. If the start of the nucleotide 
sequence desired does not correspond to the 
particular peptide, known techniques which 
enzymatically add or subtract the particular bases to 
the nucleotide chain can be employed to precisely 
make the nucleotide sequence. 

Other nucleotide sequences which are
valuable as encoding antigenic proteins are the
sequences from base 1749 to base 2722; from base 1487
to base 2980; and from base 1644 to base 2722, all
contained within the above sequence. The sequence
from base 1749 to base 2722 is especially valuable
in a vector for producing antigenic protein; see the
following examples.
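
For orientation, the sizes of these three regions follow directly from their endpoints. The sketch below assumes both endpoints are inclusive, which the text does not state explicitly; `span_length` is a hypothetical helper name.

```python
# Sketch: lengths of the base ranges named above.
# Assumption: both endpoints of each range are inclusive.

REGIONS = [(1749, 2722), (1487, 2980), (1644, 2722)]


def span_length(start: int, end: int) -> int:
    """Number of bases from `start` to `end`, counting both endpoints."""
    return end - start + 1


lengths = [span_length(start, end) for start, end in REGIONS]
whole_codons = [length // 3 for length in lengths]

for (start, end), length, codons in zip(REGIONS, lengths, whole_codons):
    print(f"{start}-{end}: {length} bases, {codons} whole codons")
```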


EXAMPLE 7 
EXPRESSION OF cDNA CLONES 

In order to begin studies on expression, a
DNA clone containing the region of sequences encoding
VP-3 and VP-1 in a single insert was first
constructed. The DNA of clone 28-94 was cleaved with
EcoRI and XbaI restriction enzymes and the 5.76 kb
band of DNA containing the 5' portion of the viral
structural protein genes was purified by agarose gel
electrophoresis and electroelution. Similarly, the
2.3 kb EcoRI-XbaI band containing the 3' portion of
the viral structural genes of clone 28-77 was
purified. These two fragments were ligated together
to re-form pBR322 vector molecules containing the
entire non-rearranged viral structural gene
sequences. The ligated mixture was transformed into
E. coli strain RR-1, thereby generating clone 57-5
with a plasmid containing an insert of intact VP-3 to
VP-1 information.

The expression of HAV-specified proteins
containing antigenic reactivity to anti-HAV
antibodies can be accomplished in bacterial cells,
yeast, or higher eukaryotic cells. Some of the
possible methods to obtain expression of the cDNA
clones into proteins or polypeptide products are
described below.

In the first example, fragments of HAV cDNA
are inserted downstream from the bacterial lac
promoter, making a fusion product with the N-terminal
amino acids of the bacterial β-galactosidase followed
by HAV polypeptides. The fragments of HAV cDNA used,
having ends generated by Bgl II or Bam HI or Hind III
or Pst I or Taq I or Sau 3A or Nco I, are ligated into
the appropriate sites of the known expression
plasmids pUC 7, 8, or 9 or pCQV2. The expression
plasmid used with each HAV DNA fragment is selected
to give an open reading frame for translation after
insertion of the DNA. The recombinant expression
plasmids are transformed into E. coli and appropriate
clones are selected, grown to an optical density of
0.8 at a wavelength of 550 nm, and lysed. The cellular
extract is then used for detection of HAV-specific
antigenic peptides.

Alternatively, the fragments of HAV cDNA are
inserted downstream of the Herpes simplex viral
thymidine kinase gene promoter in a plasmid, cloned
in E. coli, and the resulting cloned recombinant
plasmid is transfected into mouse L cells, mouse 3T3
cells, or other eukaryotic cells in culture. The
fragments of HAV cDNA used, having ends generated by
Bgl II or Bam HI or Hind III or Bal 31 enzymes, are
ligated behind the thymidine kinase ATG translational
start codon at the Rsa I site using appropriate
linker molecules of DNA to make the connection in
frame with respect to translation. The eukaryotic
cells transfected by the recombinant expression
plasmids are selected by cotransfection with the
entire Herpes thymidine kinase gene and subsequent
growth in HAT selection medium. The colonies of
selected cells are grown and cellular extracts
prepared for detection of HAV-specific antigenic
peptides.

EXAMPLE 8 

DETECTION OF EXPRESSED HAV PROTEIN AND POLYPEPTIDES 
The cells expressing HAV proteins or
polypeptides are labeled by growth in the presence of
35S-labeled methionine or 14C-labeled leucine.
Extracts of the cells are then prepared and reacted
with anti-HAV antiserum or anti-HAV monoclonal
antibodies. The antigen-antibody complexes are
collected by precipitation with formalin fixed Staph
A, dissociated by boiling in SDS-EDTA, and separated
by SDS-polyacrylamide gel electrophoresis. The
proteins are then visualized by autoradiography.

Alternatively, the expressed HAV proteins or
polypeptides are detected by a competition
radioimmune assay. Purified HAV virion particles are
labeled in vitro with 125I and precipitated with
anti-HAV antiserum or anti-HAV monoclonal
antibodies. This precipitation is performed in the
presence of increasing amounts of unlabeled extracts
prepared from the expression cells. The
antigen-antibody complexes are collected on filters
and counted in a gamma-counter. The presence of
expressed HAV products in the expression cells is
shown by a decrease in counts as the amount of
cellular extract is increased in the assay.
EXAMPLE 9

EXPRESSION OF HAV SEQUENCES IN E. COLI
Construction of Expression Plasmids

(1) Approximately 0.5 µg of pCQV2 (Cary Queen)
DNA was cleaved with BamHI and PvuII, treated with
phosphatase, and ligated to 2 µg of a gel purified 675
bp long fragment derived from clone T28-123 by
cleavage with BamHI and PvuII. The ligated mixture
was transformed into HB101 cells and ampicillin
resistant clones were screened for the presence of
insert with purified nick translated 675 base pair
fragment. One positive clone designated pHAV-6 was
assayed for production of HAV specified proteins.
The junction sequence between pCQV2 and the insert
was confirmed by cleavage with BamHI and PvuII which
released the appropriate fragment.

(2) Clone 57-5 (prepared by coupling cDNA clone
28-94 with cDNA clone 28-77) was used to get expression
of almost the entire VP-1 gene and the C-terminal
region of VP-3. DNA from plasmid 57-5 was cleaved
with BglII and PvuII and a 1.12 kbp fragment was
purified by gel electrophoresis. Approximately 2 µg
of the fragment was ligated to 0.5 µg of BamHI-PvuII
cleaved pCQV2 DNA and the ligated mixture used to
transform HB101 cells. Ampicillin resistant clones
were screened by hybridization to nick translated
BamHI-PvuII cleaved fragment. One of the clones
which proved positive was designated pHAV-57-11 and
was assayed for expression of VP-3 and/or VP-1.
Both of these constructions allow read through
from the initiating AUG codon within pCQV2, which
continues beyond the end of the insert into the pCQV2
sequences for 50 amino acids before a termination
codon is encountered. The predicted open reading
frame of pHAV-6 is therefore 275 amino acid residues
and that of pHAV57-11 is 464 amino acid residues.
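
The figure for pHAV-6 can be reproduced with simple arithmetic, sketched below. The calculation assumes that the whole insert is translated in frame and that residues contributed by the vector ahead of the insert are not counted; both function names are hypothetical.

```python
# Sketch: predicted open reading frame length from insert size.
# Assumptions: the whole insert is translated in frame, and only the
# 50 read-through residues from pCQV2 are added to it.

READTHROUGH_RESIDUES = 50


def predicted_orf_residues(insert_bp: int,
                           readthrough: int = READTHROUGH_RESIDUES) -> int:
    """Residues encoded by the insert plus vector read-through."""
    return insert_bp // 3 + readthrough


def translated_bases_implied(orf_residues: int,
                             readthrough: int = READTHROUGH_RESIDUES) -> int:
    """Insert bases implied by a stated open reading frame length."""
    return (orf_residues - readthrough) * 3


print(predicted_orf_residues(675))    # pHAV-6: 675 bp BamHI-PvuII insert
print(translated_bases_implied(464))  # bases implied by the pHAV57-11 figure
```

The 675 bp insert gives 225 codons, and adding the 50 read-through residues yields the 275 residues stated for pHAV-6. The second call inverts the same relation for the 464-residue figure, so the reader can compare the result with the fragment size given for that construct.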
Plasmid HAV57-11 contains the following nucleic acid
sequences:

1. GAT CTT GTT TTG ATT TTT CAG GTT TTT. 

2. GTT GGA GAT GAT TCA GGA GGT TTT TCA ACA ACA. 

10 

3. ATG AAG GAC CTG AAA GGG AAA GCC AAT AGG GGA AAG. 

4. ATG GAT GTT TCA GGA GTG CAA GCA CCT GTG GGA GCT 
ATC ACA ACA ATT GAG GAT CCA GCA TTA GCA AAG AAA GTA 

15 CCT GAA ACG TTT. 

5. ATG GGA AGG TCT CAT TTT TTG TGT ACT TTT ACC TTC 
AAT TCA AAT AAT AAA GAG TAC. 

20 6. ATG GCC TGG TTT ACT CCA GTA GGC CTT GCT GTT GAC 
ACC CCA. 

Plasmid HAV-6 contains the nucleic acid sequences 
identified by numbers 5 and 6 above, but not those 
25 sequences identified by numbers 1, 2, 3 and 4. 
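
Because the six sequences above are written as codon triplets, the peptides they encode can be read off with the standard genetic code. The following sketch is offered as an illustration and is not part of the original disclosure. The names CODON_TABLE, SEQUENCES and translate are hypothetical, and the one-letter peptides it prints are derived here rather than stated in the text.

```python
from itertools import product

# Standard genetic code, bases ordered T, C, A, G at each codon position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Sequences 1 to 6 of plasmid HAV57-11, copied from the list above.
SEQUENCES = {
    1: "GAT CTT GTT TTG ATT TTT CAG GTT TTT",
    2: "GTT GGA GAT GAT TCA GGA GGT TTT TCA ACA ACA",
    3: "ATG AAG GAC CTG AAA GGG AAA GCC AAT AGG GGA AAG",
    4: ("ATG GAT GTT TCA GGA GTG CAA GCA CCT GTG GGA GCT ATC ACA ACA "
        "ATT GAG GAT CCA GCA TTA GCA AAG AAA GTA CCT GAA ACG TTT"),
    5: ("ATG GGA AGG TCT CAT TTT TTG TGT ACT TTT ACC TTC AAT TCA AAT "
        "AAT AAA GAG TAC"),
    6: "ATG GCC TGG TTT ACT CCA GTA GGC CTT GCT GTT GAC ACC CCA",
}


def translate(seq):
    """Translate a DNA string codon by codon; '*' marks a stop codon."""
    seq = seq.replace(" ", "").upper()
    usable = len(seq) - len(seq) % 3          # ignore a trailing partial codon
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


if __name__ == "__main__":
    for number, dna in SEQUENCES.items():
        print(number, translate(dna))
```

Each sequence is translated in the frame in which it is written, so the output is only meaningful if that frame is the one used in the expressed protein.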
Analysis of Expressed Proteins

Cells carrying expression plasmids pHAV-6 and
pHAV57-11 were grown in L-broth to log phase and
induced by heat shock at 42°C. The pCQV2 expression
vector carries a temperature-sensitive λ repressor
which suppresses synthesis of proteins inserted at
the λ rightward promoter (λP_R). Heating at 42°C
inactivated the repressor and allowed the expression
of genes located downstream from λP_R. 1 ml of cells
was pelleted before induction and 30 minutes after
induction, lysed by boiling with SDS and
mercaptoethanol, and the proteins were separated by
discontinuous gel electrophoresis. The proteins were
transferred either to nitrocellulose (BA-85) paper or
to Millipore filters (HAHY grade) by electroblotting.
The transferred proteins were probed with two
polyclonal anti-hepatitis A antisera (anti-HAV),
designated DB-2 and C-149, raised by injection of
SDS-heat denatured virus into rabbits.
The antisera were in turn detected by 125I-labelled
protein A. Two criteria were used to determine
expression of HAV-specified proteins: (1) heat
inducibility, and (2) appearance of a novel band
absent in the original vector pCQV2 when probed with
post-immune serum. Plasmid HAV-6 showed unique bands
at ~30 and 20 kilodaltons when probed with DB-2
serum and a band at 30 kd when probed with C-149
post-immune serum. Since pHAV-6 contains an insert
entirely within the VP-1 coding region, expression of
VP-1 segments in E. coli was achieved.
Plasmid HAV57-11 showed six prominent bands when
probed with DB-2 serum. They are, in order of
decreasing size, ~50, 35, 23, 20, 16, and 14
kilodaltons. Presumably, the smaller peptides
represent degradation products of the ~50 kd protein.
The profile was simpler when the probe used was
C-149. The 50 kd band was clearly visible, as well as
two weak bands around 30 kilodaltons. The small
molecular weight protein bands were presumably
degradation products. Since pHAV57-11 contains a DNA
insert which encodes 133 amino acids of VP-3
representing the C-terminus, as well as most of the
VP-1 sequences excluding ~30 amino acids from the
C-terminal end, expression of VP-1 segments and VP-3
segments in E. coli was achieved.
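
The largest bands observed (about 30 kd for pHAV-6 and about 50 kd for pHAV57-11) can be compared with the sizes expected from the predicted open reading frames of 275 and 464 residues. The sketch below is illustrative only. It uses a rule-of-thumb average of 110 daltons per amino acid residue, which is an assumption introduced here and not a figure from the text, and the name approx_mass_kd is hypothetical.

```python
# Rule-of-thumb average residue mass; an assumption, not taken from the text.
AVERAGE_RESIDUE_MASS_DA = 110

# Predicted open reading frame lengths stated in this Example.
PREDICTED_RESIDUES = {"pHAV-6": 275, "pHAV57-11": 464}


def approx_mass_kd(residues):
    """Rough protein mass in kilodaltons from a residue count."""
    return residues * AVERAGE_RESIDUE_MASS_DA / 1000


if __name__ == "__main__":
    for plasmid, residues in PREDICTED_RESIDUES.items():
        print(f"{plasmid}: {residues} residues, roughly {approx_mass_kd(residues):.0f} kd")
```

Under this assumption the estimates come to about 30 kd and about 51 kd, in line with the largest bands reported for each plasmid. Apparent sizes on SDS gels can differ from calculated masses, so the comparison is only approximate.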

Screening with Marmoset Serum 

Pre- and post-immune serum from a marmoset
injected with native virus were used to screen the
protein blots. Pre-immune serum failed to bind to any
novel induced protein in either pHAV-6 or pHAV57-11.
Post-immune serum, on the other hand, reacted with a
protein ~50 kd in size produced in induced pHAV57-11
cells and another ~18 kilodaltons in size. The
detection of the ~50 kd protein in pHAV57-11 suggests
that at least some of the same determinants or
epitopes seen on the native virus are also available
for interaction with antibody in the 50 kd protein.
The presence of ~133 amino acid residues of VP-3
sequences at the N-terminus of VP-1 presumably allows
the VP-1 protein to assume a tertiary structure which
exposes at least some of the determinants normally
exposed on native virus. Hence the 50 kd protein
and possibly the ~18 kd protein made in E. coli are
potential immunogens which may be capable of
eliciting a normal immune response.
WHAT IS CLAIMED IS:

1. A nucleotide sequence coding for HAV
antigenic protein or sub-unit protein having the
sequence, or any other nucleotide sequence having at
least one different codon that codes for the same
peptide:
1490 1500 1510 1520 1530

GAGATCTTGT TTTTGATTTT CAGGTTTTTC CAACCAAATA TCATTCAGGT

1540 1550 1560 1570 1580

AGGTTGTTGT TTTGCTTTGT TCCTGGGAAT GAGTTGATAG ATGTTACTGG

1590 1600 1610 1620 1630

AATCACATTA AAACAGGCAA CCACTGCTCC TTGTGCAGTG ATGGACATTA

1640 1650 1660 1670 1680

CAGGAGTGCA GTCAACCTTG AGATTTCGTG TTCCTTGGAT TTCTGATACA

1690 1700 1710 1720 1730

CCCTATCGAG TGAATAGGTA CACGAAGTCA GCACATCAAA AAGGTGAGTA

1740 1750 1760 1770 1780

TACTGCCATT GGGAAGCTTA TTGTGTATTG TTATAATAGG CTGACTTCTC

1790 1800 1810 1820 1830

CTTCTAATGT TGCTTCTCAT GTTAGAGTTA ATGTTTATCT TTCAGCAATT

1840 1850 1860 1870 1880

AATTTGGAAT GTTTTGCTCC TCTTTATCAT GCTATGGATG TTACCACACA

1890 1900 1910 1920 1930

GGTTGGAGAT GATTCAGGAG GTTTTTCAAC AACAGTTTCG ACAGAGCAGA

1940 1950 1960 1970 1980

ATGTTCCTGA TCCCCAAGTT GGTATAACAA CTATGAAGGA CCTGAAAGGG

1990 2000 2010 2020 2030

AAAGCCAATA GGGGAAAGAT GGATGTTTCA GGAGTGCAAG CACCTGTGGG

2040 2050 2060 2070 2080

AGCTATCACA ACAATTGAGG ATCCAGCATT AGCAAAGAAA GTACCTGAAA

2090 2100 2110 2120 2130

CGTTTCCTGA ATTGAAGCCT GGAGAGTCTA GACATACATC AGATCACATG

2140 2150 2160 2170 2180

TCTATTTATA AATTCATGGG AAGGTCTCAT TTTTTGTGTA CTTTTACCTT

2190 2200 2210 2220 2230

CAATTCAAAT AATAAAGAGT ACACATTTCC AATAACCTTG TCTTCGACTT

2240 2250 2260 2270 2280

CTAATCCTCC TCATGGTTTA CCATCAACAT TAAGGTGGTT CTTCAATCTG

2290 2300 2310 2320 2330

TTTCAGTTGT ATAGAGGACC ATTGGATTTG ACAATTATCA TCACAGGAGC
2340 2350 2360 2370 2380

TACCGATGTG GATGGAATGG CCTGGTTTAC TCCAGTAGGC CTTGCTGTTG

2390 2400 2410 2420 2430

ACACCCCATG GGTGGAAAAG GAATCAGCTT TGTCTATTGA TTATAAAACT

2440 2450 2460 2470 2480

GCCCTTGGAG CTGTTAGATT TAATACAAGA AGAACAGGGA ACATTCAGAT

2490 2500 2510 2520 2530

TAGATTGCCA TGGTATTCTT ATTTATATGC TGTGTCTGGA GCACTGGATG

2540 2550 2560 2570 2580

GCTTGGGAGA TAAGACAGAT TCTACATTTG GATTGGTTTC CATACAGATT

2590 2600 2610 2620 2630

GCAAATTACA ACCACTCTGA TGAATATTTG TCCTTTAGTT GTTATTTGTC

2640 2650 2660 2670 2680

TGTCACACAA CAATCAGAGT TCTATTTTCC TAGAGCTCCA TTAAATTCAA

2690 2700 2710 2720 2730

ATGCTATGTT GTCCACTGAG TCTATGATGA GTAGAATTGC AGCTGGAGAC

2740 2750 2760 2770 2780

TTGGAGTCAT CAGTGGATGA TCCTAGATCA GAGGAAGACA GAAGATTTGA

2790 2800 2810 2820 2830

GAGTCATATA GAATGTAGGA AACCATATAA AGAATTGAGA TTGGAGGTTG

2840 2850 2860 2870 2880

GGAAACAAAG ACTTAAATAT GCTCAGGAAG AGTTGTCAAA TGAAGTGCTT

2890 2900 2910 2920 2930

CCACCTCCTA GGAAAATGAA GGGGTTATTT TCACAAGCCA AAATTTCTCT

2940 2950 2960 2970 2980

TTTTTATACT GAGGAACATG AAATAATGAA ATTTLCOTGG AGAGGAGTGA
2. The nucleotide sequences within that
claimed in Claim 1, which are:

GAT CTT GTT TTG ATT TTT CAG GTT TTT,
GTT GGA GAT GAT TCA GGA GGT TTT TCA ACA ACA,
ATG AAG GAC CTG AAA GGG AAA GCC AAT AGG GGA AAG,

ATG GAT GTT TCA GGA GTG CAA GCA CCT GTG GGA GCT ATC
ACA ACA ATT GAG GAT CCA GCA TTA GCA AAG AAA GTA CCT
GAA ACG TTT,

ATG GGA AGG TCT CAT TTT TTG TGT ACT TTT ACC TTC AAT
TCA AAT AAT AAA GAG TAC and

ATG GCC TGG TTT ACT CCA GTA GGC CTT GCT GTT GAC ACC
CCA.

3. Vectors containing all or part of the
nucleotide sequences of Claim 2 and adapted to
express in a suitable host the peptide coded for by
said nucleotide sequence.

4. The vectors of Claim 3 wherein the host
is a prokaryotic or eukaryotic organism.
5. A suitable prokaryotic or eukaryotic
host containing a vector according to Claim 4, the
host adapted to express at least part of the peptide
coded for by said nucleotide sequence.

6. Plasmid HAV57-11 containing the nucleic
acid sequences

GAT CTT GTT TTG ATT TTT CAG GTT TTT,
GTT GGA GAT GAT TCA GGA GGT TTT TCA ACA ACA,
ATG AAG GAC CTG AAA GGG AAA GCC AAT AGG GGA AAG,

ATG GAT GTT TCA GGA GTG CAA GCA CCT GTG GGA GCT ATC
ACA ACA ATT GAG GAT CCA GCA TTA GCA AAG AAA GTA CCT
GAA ACG TTT,

ATG GGA AGG TCT CAT TTT TTG TGT ACT TTT ACC TTC AAT
TCA AAT AAT AAA GAG TAC and

ATG GCC TGG TTT ACT CCA GTA GGC CTT GCT GTT GAC ACC
CCA.

7. Plasmid HAV-6 containing the nucleic
acid sequences

ATG GGA AGG TCT CAT TTT TTG TGT ACT TTT ACC TTC AAT
TCA AAT AAT AAA GAG TAC and

ATG GCC TGG TTT ACT CCA GTA GGC CTT GCT GTT GAC ACC
CCA.